BSBC: Towards a Succinct Data Format for XML Streams

Authors

  • Stefan Böttcher
  • Rita Hartel
  • Christian Heinzemann
Abstract

XML data compression is an important feature in XML data exchange, particularly when the data size may cause bottlenecks or when bandwidth and energy consumption limitations require reducing the amount of exchanged XML data. However, applications based on XML data streams also require efficient path query processing on the structure of compressed XML data streams. We present a succinct representation of XML data streams, called Bit-Stream-Based-Compression (BSBC), that fulfills these requirements and additionally provides a compression ratio that is significantly better than that of other queriable XML compression techniques, i.e., XGrind and DTD subtraction, and that of non-queriable compression techniques like gzip. Finally, we present an empirical evaluation comparing BSBC with these compression techniques and with XMill that demonstrates the benefits of BSBC.

Similar resources

A Grammar-based Approach for Compressing XML

XML is a popular meta-language in widespread use across a variety of application domains. However, its verbose nature has limited its acceptance in cases where a more succinct textual or binary data encoding format can be used. In this report, we describe AXECHOP, an XML-conscious compressor which uses a grammar-based approach to exploit the possibly significant structural redundancies within XM...

XML Binary Serialization using Cross-Format Schema Protocol (XFSP) and XML Compression Considerations for Extensible 3D (X3D) Graphics

The NPS Cross-Format Schema Protocol (XFSP) has been developed as a general approach to binary serialization of XML documents. Elements and attributes are replaced via a tokenization scheme which carefully preserves valid XML document structure. XFSP uses XML schema as the basis for determining key document parameters such as legal elements, attributes and data types. Originally motivated by th...

FluXQuery: An Optimizing XQuery Processor for Streaming XML Data

XML has established itself as the ubiquitous format for data exchange on the Internet. An imminent development is that of streams of XML data being exchanged and queried. Data management scenarios where XQuery [11] is evaluated on XML streams are becoming increasingly important and realistic, e.g. in e-commerce settings. Naturally, query engines employed for stream processing are main-memory-ba...

Space-efficient Data Structures for Collections of Textual Data

This thesis focuses on the design of succinct and compressed data structures for collections of string-based data, specifically sequences of semi-structured documents in textual format, sets of strings, and sequences of strings. The study of such collections is motivated by a large number of applications both in theory and practice. For textual semi-structured data, we introduce the concept of ...

NEXMark – A Benchmark for Queries over Data Streams DRAFT

A lot of research has focused recently on executing queries over data streams. This recent attention is due to the large number of data streams available, and the desire to get answers to queries on these streams in real time. There are many sources of data streams: environmental sensor networks, network routing traces, financial transactions and cell phone call records. Many systems are curren...

Publication date: 2008